DETAILED ACTION

1. Claims 1-22 have been examined.

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

This application currently names joint inventors. In considering patentability of
the claims under 35 U.S.C. 103(a), the examiner presumes that the subject matter of
the various claims was commonly owned at the time any inventions covered therein
were made absent any evidence to the contrary. Applicant is advised of the obligation
under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was
not commonly owned at the time a later invention was made in order for the examiner to
consider the applicability of 35 U.S.C. 103(c) and potential 35 U.S.C. 102(e), (f) or (g)
prior art under 35 U.S.C. 103(a).

2. Claims 1-10 and 14-22 are rejected under 35 U.S.C. 103(a) as being unpatentable
over DaCosta et al. (U.S. Patent Number 6826553) in view of Gardner (U.S. Patent 
Application Publication Number 2003/0177112). 

Referring to claim 1, DaCosta et al. teaches a method for extracting an attribute
occurrence (DaCosta et al., Figure 1, Figure 3, and Column 5 Line 37-54) from a template
generated semi-structured document (DaCosta et al., Column 11 Line 33-50, Column 6
Line 14-31, i.e. "stock quotes or weather data" and Column 12 Line 5-14) comprising
multi-attribute data records comprising (DaCosta et al., Column 7 Line 55 through
Column 8 Line 18, i.e. "Referring first to form variations, XML encoding of a form
provides key-value pairs of form parameters."):

identifying a first set of attribute occurrences in the template generated semi-
structured document (DaCosta et al., Column 5 Line 55-67, i.e. "Artificial Intelligence
(AI) techniques can be utilized to enable pattern matching to ensure that the relevant
information will still be retrieved even if the page is modified", Column 6 Line 1-14, i.e.,
"..triangulates on these attributes (structure, content and formatting) to find and lock on
the target data", and Column 6 Line 14-32, "extraction instructions");

learning a pattern for an attribute corresponding to an identified attribute 
occurrence of the first set in the template generated semi-structured document 
(DaCosta et al. Column 6 Line 14-32 "capable of performing some degree of learning");
and 

applying the pattern in the template generated semi-structured document to 
extract a second set of attribute occurrences (DaCosta et al. Column 6 Line 14-32, i.e.,
"the extraction module 20 infers extraction rules and applies them to the remainder of the
data in the web page or other web-based accessible document."). 

However, DaCosta et al. does not explicitly disclose that said method uses an
ontology in identifying a set of attribute occurrences and that a boundary of each multi-
attribute data record is determined. On the other hand, Gardner teaches an ontology-
based information management system and method wherein ontology is used to identify
a set of attributes (Gardner Paragraph 0048, i.e. "using an ontological approach",
Paragraph 0049, and Paragraph 0051) and a boundary of each multi-attribute data
record in the template generated semi-structured document (Gardner Paragraph 0107,
i.e. "pair wise distances between ontology terms measured using different distance
measure." and Paragraph 0017-0019).

At the time the invention was made, it would have been obvious to a 
person of ordinary skill in the art to combine the ontology-based information 
management system and method of Gardner with the method and system for providing 
database functions for multiple Internet sources as taught by DaCosta et al. so that the 
combined method and system would identify a set of attribute occurrences in template- 
generated documents, determine a boundary of each multi-attribute data record in the 
said documents and employ the pattern of attributes within the said boundary of each 
multi-attribute data record in the template generated semi-structured document to 
extract a second set of attribute occurrences. One would have been motivated to do so 
in order to provide "a particularly well-developed set of semantic relationships built into 
the ontology" (Gardner Paragraph 0008). 

Referring to claim 2, DaCosta et al. in view of Gardner as applied above with
regard to claim 1 discloses the invention as claimed. DaCosta et al. in view of Gardner
is directed to the method for claim 1, further comprising the step of providing a seed
ontology ("search results of the ontology search") prior to identifying the first set of
attribute occurrences (Gardner Paragraph 0089-0096, "ontology-based search",
and Paragraph 0114).

Referring to claim 3, DaCosta et al. in view of Gardner as applied above with 
regard to claim 1 discloses the invention as claimed. DaCosta et al. in view of Gardner 
is directed to the method for claim 1, wherein the ontology is one of a seed ontology
(Gardner Paragraph 0091, i.e., "one or more search strings" of the ontological search)
and an enriched ontology (Gardner Paragraph 0092-0093, i.e. "the search results of the
ontological search" and Paragraph 0100).

Referring to claim 4, DaCosta et al. in view of Gardner as applied above with 
regard to claim 1 discloses the invention as claimed. DaCosta et al. in view of Gardner 
is directed to the method for claim 1, further comprising enriching the ontology with the
second set of attribute occurrences (Gardner Paragraph 0105, i.e. "The system may, in
operation 624, augment an existing ontology with newly discovered terms to maximize 
the discriminable term coverage."). 

Referring to claim 5, DaCosta et al. in view of Gardner as applied above with 
regard to claim 1 discloses the invention as claimed. DaCosta et al. in view of Gardner 
is directed to the method for claim 1, wherein the pattern is a path abstraction 
expression, wherein the path abstraction expression is a regular expression that does 
not comprise a union operator, and a closure operator only applies to single symbols 
(DaCosta et al., Column 7 Line 29-54, i.e., "the recorded path", DaCosta et al., Column
9 Line 45-52).

Referring to claim 6, DaCosta et al. in view of Gardner as applied above with
regard to claim 1 discloses the invention as claimed. DaCosta et al. in view of Gardner
is directed to the method for claim 1, wherein learning the pattern for each attribute
occurrence comprises: identifying the attribute occurrence in a data structure tree
(DaCosta et al., Column 12 Line 5-14 "the system infers that the user wishes to further
extract information 1160 using this parental hierarchy information"); and determining the
pattern of the attribute occurrence in the data structure tree (DaCosta et al., Column 12
Line 5-61, i.e. "parental hierarchy information" 1130-1140-1150 and "Using AI
techniques, the system of the present invention evaluates parental information 1130-
1140-1150 and 1170-1180-1190 to determine a most likely pattern that the highlighted
element 1150 and element 1160 can be classified as satisfying and then finds the next
element which matches that pattern, if any").

Referring to claim 7, DaCosta et al. in view of Gardner as applied above with 
regard to claim 6 discloses the invention as claimed. DaCosta et al. in view of Gardner 
is directed to the method for claim 6, further comprising the step of generalizing the 
pattern of the attribute occurrence prior to applying the pattern (Gardner Paragraph 
0105-0106, i.e., "the system may combine multiple available ontologies into a single 
validated aggregate ontology").

Referring to claim 8, DaCosta et al. in view of Gardner as applied above with 
regard to claim 6 discloses the invention as claimed. DaCosta et al. in view of Gardner 
is directed to the method for claim 6, wherein the pattern comprises elements including 
a location and a format of the attribute occurrence (DaCosta et al., Column 6 Line 1-13,
"a snippet of relevant information on a web page or other web-accessible document
contains structural, contents and formatting attributes").

Referring to claim 9, DaCosta et al. in view of Gardner as applied above with 
regard to claim 8 discloses the invention as claimed. DaCosta et al. in view of Gardner 
is directed to the method for claim 8, wherein the elements are nodes in the data 
structure tree (DaCosta et al., Column 12 Line 5-42, i.e. "the lineage or ancestry tree
matches" and "Using AI techniques, the system of the present invention evaluates
parental information 1130-1140-1150 and 1170-1180-1190 to determine a most likely
pattern that the highlighted element 1150 and element 1160 can be classified as
satisfying and then finds the next element which matches that pattern, if any").

Referring to claim 10, DaCosta et al. in view of Gardner as applied above with 
regard to claim 7 discloses the invention as claimed. DaCosta et al. in view of Gardner 
is directed to the method of claim 7, further comprising resolving the ambiguities in the 
extracted attribute occurrences (DaCosta et al., Column 9 Line 45-52, i.e. "extraction 
module" and Figure 15) comprising: 

identifying attribute occurrences in the template generated semi-structured
document matching more than one pattern (DaCosta et al., Column 5 Line 63-67, i.e.
"Artificial Intelligence (AI) techniques can be utilized to enable pattern matching to
ensure that the relevant information will still be retrieved"). Note that it is inherent that
pattern occurrences are identified in any AI pattern matching;

determining a pattern that uniquely matches a given attribute occurrence and no
other pattern uniquely matches the given attribute occurrence (DaCosta et al., Column 5
Line 63-67, i.e. "Artificial Intelligence (AI) techniques can be utilized to enable pattern
matching to ensure that the relevant information will still be retrieved"). Note that it is
inherent in any AI pattern matching that a pattern that uniquely matches a given
occurrence is determined; and

eliminating matches between the given attribute occurrence and another pattern
that matches the given attribute occurrence and at least one other attribute occurrence
(DaCosta et al., Column 5 Line 63-67, i.e. "Artificial Intelligence (AI) techniques can be
utilized to enable pattern matching to ensure that the relevant information will still be
retrieved"). Note that it is inherent in any AI pattern matching that occurrences that are
not unique are eliminated.
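
For illustration only, the three steps recited in claim 10 can be sketched as follows. This sketch is not taken from DaCosta et al., Gardner, or the application; the function name resolve_ambiguities and the dictionary layout (pattern name mapped to the set of occurrence identifiers it matches) are assumptions made for the example.

```python
from collections import defaultdict


def resolve_ambiguities(matches):
    """Prune ambiguous matches (hypothetical helper, not from the cited art).

    matches maps a pattern name to the set of occurrence ids it matches.
    Returns a new mapping; the input is left unchanged.
    """
    resolved = {pattern: set(occs) for pattern, occs in matches.items()}

    # Step 1: find which patterns match each occurrence.
    by_occurrence = defaultdict(set)
    for pattern, occs in matches.items():
        for occ in occs:
            by_occurrence[occ].add(pattern)

    for occ, patterns in by_occurrence.items():
        if len(patterns) < 2:
            continue  # matched by a single pattern, so not ambiguous

        # Step 2: patterns that match this occurrence and no other one.
        unique = [p for p in patterns if matches[p] == {occ}]
        if len(unique) != 1:
            continue  # no single uniquely matching pattern; leave as is

        # Step 3: drop the occurrence from every other pattern,
        # each of which also matches at least one other occurrence.
        for pattern in patterns:
            if pattern != unique[0]:
                resolved[pattern].discard(occ)

    return resolved
```

With the assumed input {"price": {"o1"}, "number": {"o1", "o2"}}, occurrence "o1" stays with "price" and is removed from "number".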

Referring to claim 14, DaCosta et al. in view of Gardner as applied above with 
regard to claim 1 discloses the invention as claimed. DaCosta et al. in view of Gardner 
is directed to the method for claim 1, wherein determining the boundary of each multi-
attribute data record comprises:

providing a tree of a page (DaCosta et al., Column 5 Line 23-29, i.e. "the
Document Object Model (DOM) of an document" and DaCosta et al., Column 12 Line
5-14 "the system infers that the user wishes to further extract information 1160 using
this parental hierarchy information") and a set of attribute names of a concept of the
ontology (DaCosta et al., Figure 1, Figure 3, and Column 5 Line 37-54 and Gardner
Paragraph 0048, i.e. "using an ontological approach", Paragraph 0049, and Paragraph
0051);

marking a node in the tree by a set of attributes present in a subtree rooted at the
node (DaCosta et al., Column 12 Line 5-42, i.e. "the lineage or ancestry tree matches"
and "Using AI techniques, the system of the present invention evaluates parental
information 1130-1140-1150 and 1170-1180-1190 to determine a most likely pattern
that the highlighted element 1150 and element 1160 can be classified as satisfying and
then finds the next element which matches that pattern, if any");

determining a set of maximally marked nodes in the tree (Gardner Paragraph 
0107, i.e. "pair wise distances between ontology terms measured using different 
distance measure.", Paragraph 0105, i.e. "establish a maximum set of ontological 
terms", and Paragraph 0017-0019); 

determining a page type (Gardner Paragraph 0093-0094 and 0089, i.e., a user's
background/profile is determined to determine the kind of page to be searched); and

extracting a boundary according to the page type (Gardner Paragraph 0104). 
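
For illustration only, the marking and maximally-marked-node steps recited in claim 14 can be sketched as follows. This is a hedged sketch, not the method of the application or of the cited references. The Node class, the substring test standing in for an ontology lookup, and the definition of "maximally marked" (a node carrying the largest attribute set in the tree, none of whose children carries a set of the same size) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical page-tree node; not a DOM API."""
    tag: str
    text: str = ""
    children: list = field(default_factory=list)
    mark: frozenset = frozenset()


def mark_tree(node, attribute_names):
    """Mark each node with the attribute names present in its subtree.

    Assumption: an attribute is 'present' when its name occurs in the
    node text. A real system would consult the ontology instead.
    """
    found = {a for a in attribute_names if a.lower() in node.text.lower()}
    for child in node.children:
        found |= mark_tree(child, attribute_names)
    node.mark = frozenset(found)
    return found


def maximally_marked(root):
    """Return the lowest nodes that carry the largest mark in the tree."""
    nodes, stack = [], [root]
    while stack:
        current = stack.pop()
        nodes.append(current)
        stack.extend(current.children)
    best = max(len(n.mark) for n in nodes)
    if best == 0:
        return []  # no attribute was found anywhere in the tree
    return [
        n for n in nodes
        if len(n.mark) == best and all(len(c.mark) < best for c in n.children)
    ]
```

Under these assumptions, a table row holding both a "price" cell and a "title" cell would be reported as the maximally marked node, while the enclosing table would not, because one of its children already carries the same attribute set.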

Referring to claim 15, DaCosta et al. in view of Gardner as applied above with
regard to claim 14 discloses the invention as claimed. DaCosta et al. in view of Gardner
is directed to the method for claim 14, wherein the page type is one of a home page and
a referral page (Gardner Paragraph 0089). Note that the method of Gardner could
identify a home page depending on the background/profile of the user and, inherently,
could refer to other pages.

Referring to claim 16, DaCosta et al. in view of Gardner as applied above with
regard to claim 14 discloses the invention as claimed. DaCosta et al. in view of Gardner
is directed to the method for claim 14, wherein extracting the boundary further
comprises:

determining a maximally marked node with a highest score among the set of
maximally marked nodes in the tree (Gardner Paragraph 0105, i.e. "establish a maximum
set of ontological terms");

determining whether the tree comprises a single-valued attribute (DaCosta et al.,
Column 5 Line 23-29, i.e. "the Document Object Model (DOM) of an document", Gardner
Paragraph 0005, "An ontology term may be a single named concept describing an
object or entity", Paragraph 0051, "An ontology may be used to enable effective
syntactic and semantic mapping between any number of different entities..", DaCosta
Figure 9 wherein single-value/multi-value attributes are presented in the record table,
and DaCosta et al. Column 11 Line 61 through Column 12 Line 5);

determining values of the single-marked attribute upon determining the single-
valued attribute (DaCosta Figure 9 wherein single-value/multi-value attributes are
presented in the record table and DaCosta et al. Column 11 Line 61 through Column
12 Line 5);

determining whether the tree comprises a multiple-valued attribute (DaCosta et
al., Column 5 Line 23-29, i.e. "the Document Object Model (DOM) of an document",
DaCosta Figure 9 wherein single-value/multi-value attributes are presented in the
record table, and DaCosta et al. Column 11 Line 61 through Column 12 Line 5); and

determining values of the multiple-marked attribute upon determining the
multiple-valued attribute (DaCosta Figure 9 wherein single-value/multi-value attributes
are presented in the record table, and DaCosta et al. Column 11 Line 61 through
Column 12 Line 5).

Referring to claim 17, DaCosta et al. in view of Gardner as applied above with 
regard to claim 1 discloses the invention as claimed. DaCosta et al. in view of Gardner 
is directed to a method for enriching an adaptive search engine comprising: 

providing one of a seed ontology (Gardner Paragraph 0091, i.e., "one or more
search strings" of the ontological search) and an enriched ontology (Gardner Paragraph
0092-0093, i.e. "the search results of the ontological search" and Paragraph 0100), the
ontology comprising a set of concepts and a set of attributes associated with every
concept (Gardner Paragraph 0005, "An ontology term may be a single named concept
describing an object or entity", Paragraph 0051, "An ontology may be used to enable
effective syntactic and semantic mapping between any number of different entities..");

determining an attribute identifier for a document of interest (DaCosta et al.,
Column 5 Line 55-67, i.e. "Artificial Intelligence (AI) techniques can be utilized to enable
pattern matching to ensure that the relevant information will still be retrieved even if the
page is modified", Column 6 Line 1-14, i.e., "..triangulates on these attributes (structure,
content and formatting) to find and lock on the target data", and Column 6 Line 14-32,
"extraction instructions"); and

adding the attribute identifier to the ontology for identifying attribute occurrences 
in at least the document of interest (Gardner Paragraph 0050-0053). 

Referring to claim 18, DaCosta et al. in view of Gardner as applied above with
regard to claim 17 discloses the invention as claimed. DaCosta et al. in view of Gardner
is directed to the method of claim 17, wherein determining the attribute identifier further
comprises:

determining a methodology of the attribute identifier (Gardner, Paragraph 0049- 
0053); and 

determining a set of parameter values to be used by the methodology (Gardner,
Paragraph 0073, 0089, and 0096).

Claim 19 is rejected on the same basis as claim 1. 

Referring to claim 20, DaCosta et al. in view of Gardner as applied above with 
regard to claim 1 discloses the invention as claimed. DaCosta et al. in view of Gardner 
is directed to an adaptive search engine appliance for searching a database of multi- 
attribute data records in a template generated semi-structured document, the search 
engine appliance comprising: 

an ontology for identifying a first set of attribute occurrences in the template 
generated semi-structured document, the ontology comprising a set of 
concepts and a set of attributes associated with every concept (Gardner Paragraph 
0005, "An ontology term may be a single named concept describing an object or entity", 
Paragraph 0051, "An ontology may be used to enable effective syntactic and semantic
mapping between any number of different entities..");

a boundary module for determining a boundary of each multi-attribute data
record in the template generated semi-structured document (Gardner Paragraph 0107,
i.e. "pair wise distances between ontology terms measured using different distance
measure." and Paragraph 0017-0019); and

a pattern module for learning a pattern for an attribute corresponding to an
identified attribute occurrence of the first set in the template
generated semi-structured document (DaCosta et al., Column 5 Line 55-67, i.e.
"Artificial Intelligence (AI) techniques can be utilized to enable pattern matching to ensure
that the relevant information will still be retrieved even if the page is modified", Column
6 Line 1-14, i.e., "..triangulates on these attributes (structure, content and formatting) to
find and lock on the target data", and Column 6 Line 14-32, "extraction instructions").

Referring to claim 21, DaCosta et al. in view of Gardner as applied above with
regard to claim 20 discloses the invention as claimed. DaCosta et al. in view of Gardner
is directed to the adaptive search engine of claim 20, wherein the pattern is applied
within the boundary of each multi-attribute data record in the template generated semi-
structured document to extract a second set of attribute occurrences (DaCosta et al.
Column 6 Line 14-32, i.e., "the extraction module 20 infers extraction rules and applies
them to the remainder of the data in the web page or other web-based accessible
document." and Gardner Paragraph 0107, i.e. "pair wise distances between ontology
terms measured using different distance measure." and Paragraph 0017-0019).

Referring to claim 22, DaCosta et al. in view of Gardner as applied above with 
regard to claim 20 discloses the invention as claimed. DaCosta et al. in view of Gardner 
is directed to the adaptive search engine of claim 20, wherein the database of multi- 
attribute data records is stored on a server connected to the adaptive search engine 
application across a communications network (DaCosta et al., Figure 22 "Smart Server"
and "Web Folders" and Column 18 Line 8-25).

3. Claims 11 and 13 are rejected under 35 U.S.C. 103(a) as being unpatentable over
DaCosta et al. in view of Gardner and further in view of Oommen (U.S. Patent
Application Publication Number 2003/0195890).

Referring to claim 11, DaCosta et al. in view of Gardner as applied to claim 1
above does not explicitly disclose that learning the pattern for an attribute corresponding
to an identified attribute occurrence of the first set in the template generated semi-
structured document comprises learning positive examples of the attribute and learning
negative examples of the attribute. However, Oommen teaches a method and system
for comparing the closeness of a target tree to other trees wherein learning the pattern
comprises learning positive examples and negative examples (Oommen, Paragraph
00198, Paragraph 0078-0085, and Paragraph 0116-0117, i.e., "If the test in block 800
returns a negative answer" and "If the test in block 800 returns a positive answer").

At the time the invention was made, it would have been obvious to a person of 
ordinary skill in the art to combine the feature of comparing closeness of trees as taught 
by Oommen to the method and system of DaCosta et al. in view of Gardner as applied 
to claim 1 above so that, in the combined method and system, the learning pattern 
would comprise learning positive examples of the attribute and learning negative 
examples of the attribute. One would have been motivated to do so in order to provide
a solution for "all string, substring and subsequence recognition algorithms" as well as
"a solution to all tree, subtree and subsequence tree recognition problems" (Oommen,
Paragraph 0061).

Referring to claim 13, DaCosta et al. in view of Gardner and further in view of
Oommen as applied above with regard to claim 11 discloses the invention as claimed.
DaCosta et al. in view of Gardner and further in view of Oommen is directed to the
method for claim 1, wherein learning the pattern for an attribute corresponding to an
identified attribute occurrence of the first set in the template generated semi-structured
document comprises learning negative examples of the attribute, wherein the negative
examples are positive examples of other attributes (Oommen, Paragraph 00198). In the
method of Oommen, it is inherent that if a positive answer is returned for an occurrence,
a negative answer will be returned for a different occurrence.

4. Claim 12 is rejected under 35 U.S.C. 103(a) as being unpatentable over DaCosta 
et al. in view of Gardner and further in view of Oommen and further in view of Bruno 
("Efficient Creation of Statistics over Query Expressions" by Nicolas Bruno and Surajit 
Chaudhuri, ICDE 2003: Bangalore, India, March 5-8 2003, 
http://www.cs.brown.edu/courses/cs227/Papers/AutoAdmin/buildsits.pdf) 

Referring to claim 12, DaCosta et al. in view of Gardner and further in view of
Oommen as applied to claim 11 teaches that a common supersequence for identified
attribute occurrences is determined (Oommen, Paragraph 0316, i.e. "The method of
the invention can search for the pattern where the pattern sought is distributed over a
larger supersequence as ). However, DaCosta et al. in view of Gardner and further
in view of Oommen as applied to claim 11 does not explicitly disclose that a generalized
supersequence is determined and that whether a term of the generalized
supersequence can be degeneralized is determined.

On the other hand, Bruno teaches a method for generalizing a shortest common
supersequence wherein terms in the common supersequence are generalized (Bruno,
Page 8 Column 1 Line 6 through Column 2 Line 39). Additionally, it is inherent that
whether a term in a common supersequence can be degeneralized can be determined
employing this method. At the time the invention was made, it would have been obvious
to a person of ordinary skill in the art to add the feature of generalizing terms in a common
supersequence as taught by Bruno to the method and system of DaCosta et al. in view
of Gardner and further in view of Oommen as applied to claim 11 so that the resultant
method and system would comprise determining a common supersequence for identified
attribute occurrences, determining a generalized supersequence, and determining whether
a term in the generalized supersequence can be degeneralized. One would have been
motivated to do so in order to "adder the optimization problem" (Bruno, Page 6, Column
2 Line 10-13).

Conclusion 

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Dennis Myint whose telephone number is (571) 272-5629.
The examiner can normally be reached on 8:30AM-5:30PM Monday-Friday.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, John Breene, can be reached on (571) 272-4107. The fax phone number for
the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 

Dennis Myint AU-2162 